Separating wheat from chaff: Diatom taxon selection using an artificial neural network pruning algorithm

Author

  • Yves T. Prairie
Abstract

This study addresses the question of which diatom taxa to include in a modern calibration set, based on their relative contribution to a palaeolimnological calibration model. Using a pruning algorithm for Artificial Neural Networks (ANNs) that determines the functionality of individual taxa in terms of model performance, we pruned the Surface Water Acidification Project (SWAP) pH-diatom data-set until the predictive performance of the pruned set (as assessed by a jackknifing procedure) was statistically different from that of the initial full set. Our results, based on validation at each 5% data-set reduction, show that (i) 85% of the taxa can be removed without any effect on the pH model calibration performance, and (ii) the reduction in model complexity and dimensionality achieved by removing these non-essential or redundant taxa greatly improves the robustness of the calibration. A comparison between the commonly used "marginal" criteria for inclusion (species tolerance and Hill's N2) and our functionality criterion shows that the importance of each taxon in an ANN palaeolimnological model calibration does not appear to depend on these marginal characteristics.

Introduction

Several types of algorithm have been proposed to develop quantitative inference models in palaeolimnology (Birks 1995): Weighted Averaging regression/calibration (WA) (ter Braak and van Dam 1989; Birks et al. 1990), Weighted Averaging Partial Least Square regression (WA-PLS) (ter Braak and Juggins 1993), Gaussian regression and maximum likelihood calibration (ter Braak and van Dam 1989; ter Braak et al. 1993; Vasko et al. 2000), and back-propagation (BP) (Rumelhart et al. 1986) of Artificial Neural Networks (ANNs) (Racca et al. 2001). All of these methods have inherent but different abilities to model the complex relations between taxon assemblages and environmental variables, and all yield successful predictive models. However, they lack, to varying degrees, transparency as to the way information is extracted from the assemblage data and implemented in the predictive model. While it is clear that the predictive ability of these models can depend on the statistical characteristics of the calibration set (distribution and range of the environmental variable, number of samples, number of taxa, etc.), the modelling approach is also important to the final success of the model. Although some methods have been shown to outperform others in certain conditions (ter Braak and Juggins 1993; ter Braak et al. 1993; ter Braak 1995; Racca et al. 2001), little is known about the inclusion or exclusion of taxa based on their contribution to the calibration model. Generally, calibration data-sets are large and sparse and the criterion for taxon inclusion is typically ad hoc (e.g., all taxa with 1% relative abundance in at least one sample, present
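The stepwise pruning described in the abstract (remove taxa in 5% steps, re-validate by jackknifing, stop when performance degrades) can be illustrated with a minimal Python sketch. This is not the authors' ANN pruning algorithm: a plain weighted-averaging model (without deshrinking) stands in for the neural network, a fixed RMSE ratio (`tolerance`) stands in for the statistical test, and the data are synthetic. All function names and parameters here are hypothetical.

```python
import math
import random


def wa_optima(abund, ph, taxa):
    """Weighted-averaging optimum per taxon: abundance-weighted mean pH."""
    optima = {}
    for j in taxa:
        total = sum(row[j] for row in abund)
        if total > 0:
            optima[j] = sum(row[j] * y for row, y in zip(abund, ph)) / total
    return optima


def wa_predict(row, optima):
    """Inferred pH of one sample: abundance-weighted mean of taxon optima."""
    total = sum(row[j] for j in optima)
    if total == 0:
        return None
    return sum(row[j] * optima[j] for j in optima) / total


def jackknife_rmse(abund, ph, taxa):
    """Leave-one-out RMSE of prediction using only the listed taxa."""
    squared = []
    for i in range(len(ph)):
        train_x = abund[:i] + abund[i + 1:]
        train_y = ph[:i] + ph[i + 1:]
        pred = wa_predict(abund[i], wa_optima(train_x, train_y, taxa))
        if pred is None:  # no retained taxon present: fall back to training mean
            pred = sum(train_y) / len(train_y)
        squared.append((pred - ph[i]) ** 2)
    return math.sqrt(sum(squared) / len(squared))


def prune(abund, ph, step=0.05, tolerance=1.05):
    """Drop the least useful taxa in steps until the error exceeds the tolerance."""
    taxa = list(range(len(abund[0])))
    baseline = jackknife_rmse(abund, ph, taxa)
    n_drop = max(1, round(step * len(taxa)))
    history = [(len(taxa), baseline)]
    while len(taxa) > n_drop:
        current = jackknife_rmse(abund, ph, taxa)
        # Score each taxon by how much the error rises when it alone is removed.
        scores = {
            j: jackknife_rmse(abund, ph, [k for k in taxa if k != j]) - current
            for j in taxa
        }
        drop = set(sorted(taxa, key=lambda j: scores[j])[:n_drop])
        candidate = [j for j in taxa if j not in drop]
        err = jackknife_rmse(abund, ph, candidate)
        if err > tolerance * baseline:  # assumed stand-in for a significance test
            break
        taxa = candidate
        history.append((len(taxa), err))
    return taxa, history


# Synthetic calibration set: 30 samples, 20 taxa, only the first 5 respond to pH.
rng = random.Random(0)
ph = [rng.uniform(4.5, 7.5) for _ in range(30)]
abund = []
for y in ph:
    row = []
    for j in range(20):
        if j < 5:
            optimum = 4.5 + 0.75 * j
            row.append(math.exp(-((y - optimum) ** 2) / (2 * 0.5 ** 2)))
        else:
            row.append(0.2 * rng.random())
    abund.append(row)

kept, history = prune(abund, ph)
print("taxa kept:", len(kept), "final jackknife RMSE:", history[-1][1])
```

Scoring each taxon by the error increase when it is removed alone is only one possible "functionality" measure, chosen here for simplicity; it makes the loop quadratic in the number of taxa, which is acceptable for a toy data-set but not for a full calibration set.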


Similar resources

Effective Feature Selection for Pre-Cancerous Cervix Lesions Using Artificial Neural Networks

Since the most common form of cervical cancer starts with pre-cancerous changes, flawless detection of these changes is important for preventing and treating cervical cancer. There are two ways to stop this disease from developing. One is to find and treat pre-cancers before they become true cancers; the other is to prevent the pre-cancers in the first place. The presented approach...


Adaptive Predictive Controllers Using a Growing and Pruning RBF Neural Network

An adaptive version of a growing and pruning RBF neural network has been used to predict the system output and implement Linear Model-Based Predictive Controller (LMPC) and Non-linear Model-based Predictive Controller (NMPC) strategies. A radial-basis neural network with growing and pruning capabilities is introduced to carry out on-line model identification. An Unscented Kal...


Identifying Flow Units Using an Artificial Neural Network Approach Optimized by the Imperialist Competitive Algorithm

The spatial distribution of petrophysical properties within reservoirs is one of the most important factors in reservoir characterization. A flow unit is a continuous body over a specific reservoir volume within which the geological and petrophysical properties are the same. Accordingly, an accurate prediction of flow units is a major task to achieve a reliable petrophysical description o...


Optimizing of Iron Bioleaching from a Contaminated Kaolin Clay by the Use of Artificial Neural Network

In this research, the amount of iron removed by bioleaching a kaolin sample with high iron impurity using Aspergillus niger was optimized. To study the effect of initial pH, sucrose and spore concentration on iron, oxalic acid and citric acid concentration, more than twenty experiments were performed. The resulting data were used to train, validate and test the two-layer artificia...


Estimation of Cadmium and Uranium in a stream sediment from Eshtehard region in Iran using an Artificial Neural Network

Considering the importance of Cd and U as pollutants of the environment, this study aims to predict the concentrations of these elements in a stream sediment from the Eshtehard region in Iran by means of a developed artificial neural network (ANN) model. The forward selection (FS) method is used to select the input variables and develop hybrid models by ANN. From 45 input candidates, 13 and 14 ...




Publication date: 2002